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Convergence problems occur abundantly in all branches of mathematics or in the math- 
^^ ' ematical treatment of the sciences. Sequence transformations are principal tools to 

fH , overcome convergence problems of the kind. They accomplish this by converting a 

slowly converging or diverging input sequence {s„}J^o into another sequence {sJj}^o 

a with hopefully better numerical properties. Pade approximants, which convert the 

partial sums of a power series to a doubly indexed sequence of rational functions, are 
the best known sequence transformations, but the emphasis of the review will be on 
alternative sequence transformations which for some problems provide better results 
ji^ ■ than Pade approximants. 



S 



Abstract 



1 Introduction 

Many numerical techniques as for example iterative schemes, discretization methods, per- 
turbation techniques, or series expansions produce results which are actually sequences. 
Obviously, a numerical technique of that kind is practically useful only if the resulting se- 
quence converges sufficiently fast. Unfortunately, it frequently happens that the resulting 
sequence either converges too slowly to be practically useful, or it may even diverge. 

Problems with slow convergence or divergence were of course already encountered in the 
early days of calculus. Accordingly, numerical techniques for the acceleration of convergence 
or the summation of divergent series are almost as old as calculus itself. According to 
Knopp M p. 249], the first series transformation was published by Stirling [pi already 
in 1730, and in 1755 Euler [H published the series transformation which now bears his 
name. In rudimentary form, convergence acceleration methods are even older. In a book 
by Brezinski [0, pp. 90 - 91] it is mentioned that convergence acceleration methods were 
already used in 1654 by Huygens and in 1674 by Seki Kowa, the probably most famous 
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Japanese mathematician of his time. Both Huygens and Seki Kowa tried to obtain better 
approximations to n. Huygens used a linear extrapolation scheme which is a special case of 
what we now call Richardson extrapolation 0, and Seki Kowa used the so-called A^ process, 
which is usually attributed to Aitken [|6|. Then, in a book by Liem, Lii, and Shih ||7|, p. ix] it 
is mentioned that extrapolation methods were already used by the Chinese mathematicians 
Liu Hui (A.D. 263) and Zhu Chongzhi (429 - 500) for obtaining better approximations to 
TT, but no further details are given. 

Sequence transformations are principal tools to overcome convergence problems of the 
kind mentioned above. In this approach, a slowly convergent or divergent sequence {s„}5^q, 
whose elements may for instance be the partial sums 



n 

E 

fc=0 



ak (1.1) 



of an infinite series, is converted into a new sequence {sJjJJ^q with hopefully better numerical 
properties. The history of sequence transformations and related topics starting from the 
17th century until today is discussed by Brezinski in a monograph ^ or in two articles |§|,|S|]. 
Before the invention of electronic computers, mainly linear sequence transformations 
were used, which compute the elements of the transformed sequence {s^jJJ^o ^^ weighted 
averages of the elements of the input sequence {sn}^o according to 

n 

Ki = E l^^kSk- (1.2) 

fc=0 

The theoretical properties of these matrix transformations are now very well understood 



[H,M-|l9]. Their main appeal lies in the fact that for the weights /i„fc in (1.2) some nec- 
essary and sufficient conditions could be formulated which guarantee that the application 
of such a matrix transformation to a convergent sequence {sn}^o yields a transformed 
sequence {s'„}^o converging to the same limit s — Soo- Theoretically, this regularity is 
extremely desirable, but from a practical point of view, it is a disadvantage. This sounds 
paradoxical. However, Wimp remarks in the preface of his book ||l5|, p. X] that the size of 
the domain of regularity of a transformation and its efficiency seem to be inversely related. 
Accordingly, regular linear transformations are in general at most moderately powerful, and 
the popularity of most linear transformations has declined considerably in recent years. 

Modern nonlinear sequence transformations as for instance Wynn's epsilon um and 
rho ^^ algorithm or Brezinski's theta algorithm ||l^ have largely complementary properties: 
They are nonregular, which means that that the convergence of the transformed sequence 
to the correct limit is not guaranteed. In addition, their theoretical properties are far 
from being completely understood. Nevertheless, they often accomplish spectacular results. 
Consequently, nonlinear transformations now dominate both mathematical research as well 
as practical applications, as documented by the large number of recent books ||7|,[l5|,[l9|-|2^ 
and review articles [p7|-p9| on this topic. 

There is considerable evidence that the culprit for the frequently unsatisfactory perfor- 
mance of regular matrix transformation is not their linearity, but their regularity. In |3Ct] 
it was shown that suitably chosen linear but nonregular transformations can at least for 
special problems be as efficient as the most powerful nonlinear transformations. 

The best known class of sequence transformations are Pade approximants which convert 
the partial sums 

n 
fc=0 
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of a (formal) power series for some function / into a doubly indexed sequence of rational 
functions 

[l/m]f{z) = Pi{z)/Q^{z), ;,meNo. (1.4) 

Here, Pi{z) = po + piz + . . . + piz'' and Qm{z) = 1 + qiz + . . . + qmz"^ are polynomials in z 
of degrees I and m, respectively. The I + m + I polynomial coefficients pq, pi, . . . , pi and 
(7i, (?2, . . . , ^m are chosen in such a way that the Taylor expansion of the ratio Pi{z)/Qm{z) 
at z = agrees with the power series for / as far as possible: 

f{z) ~ Pi{z)/Q^{z) - 0(z'+™+i), z^O. (1.5) 

This asymptotic error estimate leads to a system of Z + to + 1 linear equations for the coef- 
ficients of the polynomials Pi{z) and Qm{z) [^,^. Moreover, several recursive algorithms 
are known. The merits and weaknesses of the various computational schemes for Pade 
approximants are discussed in [e3. Section II. 3]. 

In applied mathematics and in theoretical physics, Pade approximants have become 
the standard tool to overcome convergence problems with power series. Accordingly, there 
is an extensive literature, and any attempt to provide a complete bibliography would be 
beyond the scope of this article (see for example the extensive bibliography compiled by 
Brezinski |33| ) . The popularity of Pade approximants in theoretical physics can be traced 
back to an article by Baker |Q, who also wrote the first modern monograph |Q. The 
currently most complete treatment of the theory of Pade approximants is the 2nd edition of 
the monograph by Baker and Graves-Morris |3^ . More condensed treatments can be found 
in books on continued fractions |35|-|38[| , in a book by Bender and Orszag on mathematical 
physics [^ Section 8], or in a book by Baker on critical phenomena [^ Part III]. Then, 
there is a book by Pozzi on the use of Pade approximants in fluid dynamics |4l[| as well as 
several review articles |42|-[45|| . The generalization of Pade approximants to operators and 
multivariate power series is discussed in a book by Cuyt |Q and in articles by Cuyt [|^ and 
by Guillaume and A. Huard ||48| . Another generalization of Pade approximants - Pade- type 
approximants - are described in a book by Brezinski pl| . 

The emphasis of this article is not on Pade approximants, which are well known as 
well as extensively documented in the literature, but on alternative sequence transforma- 
tions, which are not so well known yet. This is quite undeserved. In some cases, sequence 
transformations outperform Pade approximants. For example, the present author has ap- 
plied sequence transformations successfully in such diverse fields as the evaluation of special 



functions pS , 30 , 49|-p3[ , the evaluation of molecular multicenter integrals of exponentially 
decaying functions |54|-|5q | , the summation of strongly divergent quantum mechanical per- 
turbation expansions Ea^R9H7Q], and the extrapolation of crystal orbital and cluster cal- 
culations for oligomers to their infinite chain limits of stereoregular quasi-onedimensional 
organic polymers p^-uJ. In the majority of these applications, it was either not possible 
to use Pade approximants, or sequence transformations did a better job. 

It is of course impossible to present something as complex and diverse as sequence trans- 
formations in a single and comparatively short article. Because of obvious space limitations, 
this article can at best provide some basic facts about some of the most useful sequence 
transformations and review the recent literature. However, it cannot be a substitute for 
more detailed treatments as for example an older review of the author |g8[ or the book by 
Brezinski and Redivo Zaglia p5| . 

2 On the Construction of Sequence Transformations 

The basic step for the construction of a sequence transformation is the assumption that the 
elements of a convergent or divergent sequence {snjj^o '^^^ ^^ partitioned into a limit or 
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antilimit s and a remainder r„ according to 



s + r„ 



n e No. 



(2.1) 



If a sequence {sn}5^o converges, the remainders r„ in {2A) can be made negligible by 
increasing n as much as necessary. However, many sequences converge so slowly that this 
is not feasible. Moreover, increasing n does not help in the case of divergence. 

Alternatively, one can try to improve convergence by computing approximations to the 
remainders which are then eliminated from the sequence elements. Of course, this is more 
easily said than done. The remainders {r„}^g of a sequence {sn}^o ^^^ i'^ general un- 
known, and their determination is normally not easier than the determination of the (gener- 
alized) limit. For example, if the input data {s„}J^o are the partial sums (1.1) of an infinite 
series, the remainders satisfy 



k=n+l 



a-k 



(2.2) 



Thus, the straightforward elimination of exact remainders is normally not possible. However, 
the remainders of some infinite series can be approximated with the help of the Eulcr- 
Maclaurin formula (see for example [TJ, pp. 279 - 295]). Let us consider a convergent 
infinite series X]iy=o5'(^)' ^'^'^ ^^^ ^^ assume that its terms g{v) are smooth and slowly 
varying functions of the index v. Then, the integral 



N 



g{x) dx 



M 



with M, N ^'L provides a good approximation to the sum 

\g{M) -f g(Af + 1) + . . . + ,g(iV - 1) + ig(iV) 



(2.3) 



(2.4) 



and vice versa. In the years between 1730 and 1740, Euler and Maclaurin derived inde- 
pendently correction terms, which ultimately yielded what we now call the Euler-Maclaurin 
formula: 



J2 ffH = / 9{x)dx + -[g{M)+g{N)] 



E 



s. 



2j 



-[ (2J)! 



g(2^-i)(7V)-.g(2j"-i)(M)l +Rk{g), 



Rk{g) 



{2k)\ 



N 



M 



B2k{x~lx\)g^'"'\x)Ax. 



(2.5a) 

(2.5b) 
S,„(0) is a 



Here, \x\ is the integral part of x, B„i{x) is a Bernoulli polynomial , and B.„ 
Bernoulli number. 

If we set M = n + I and N — oo, the leading terms of the Euler-Maclaurin formula, 
which is actually an asymptotic series, yield for sufficiently large n rapidly convergent ap- 
proximations to the truncation error X]i/=n+i 5(^)- 

An example, which demonstrates the usefulness of the Euler-Maclaurin formula, is the 
Dirichlet series for the Riemann zeta function: 



C(^) 



j:(-+^y 



(2.6) 



i/=0 
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This series converges for Re(z) > 1. However, it is notorious for extremely slow convergence 
if Re(z) is only slightly larger than one. 



The terms (v + l) ^ of the Dirichlet series (2^) are obviously smooth and slowly varying 



functions of the index v and they can be differentiated and integrated easily. Thus, we can 



apply the Euler-Maclaurin formula (2^) with M = n + I and A^ = oo to the tnmcation 
error of the Dirichlet series: 

+ E %^:^ (" + 2)-^-2^+i + Rk{n, z) , (2.7a) 



z-*— n+1 



,.^ (2^)! 



{Z)2k r B2k{x- ixl) 

(2fc)! i„+i [l + xY 



Mn,z) = -'^ ;:)'i^ <ix. (2.7b) 



Here, {z)„i = z{z + 1) • • • (z + m — 1) = r(z + ra)/T{z) is a Pochhammer symbol. 

In p^. Tables 8.7 and 8.8, p. 380] or in |^, Section 2] it was shown that a few terms of 



the Euler-Maclaurin approximation (2.7) suffice for a convenient and reliable computation 



of C(z) even if z is so close to one. The arguments z = 1.1 and z ~ 1.01 considered 



in |39|,[73| lead to such a slow convergence of the Dirichlet series (2^) that its evaluation via 
a straightforward addition of its terms is practically impossible. 

Thus, it looks like an obvious idea to use the Euler-Maclaurin formula routinely in the 
case of slowly convergent series. Unfortunately, this is not possible. The Euler-Maclaurin 
formula requires that the terms of the series can be differentiated and integrated with respect 
to the index. This excludes many series of interest. Moreover, the Euler-Maclaurin formula 
cannot be applied in the case of alternating or divergent series since their terms are neither 
smooth nor slowly varying. However, the probably worst drawback of the Euler-Maclaurin 
formula is that it is an analytic convergence acceleration method. This means that it cannot 
be applied if only the numerical values of the terms of a series are known. 

Sequence transformations also try to compute approximations to the remainders and 
to eliminate them from the sequence elements. However, they require only relatively little 
knowledge about the n-dependence of the remainders of the sequence to be transformed. 
Consequently, they can be applied in situations in which apart from the numerical values of 
a finite string of sequence elements virtually nothing else is known. 

Since it would be futile to try to eliminate a remainder r„ with a completely unknown and 
arbitrary n-dependence, a sequence transformation has to make some assumptions, either 
implicitly or explicitly. Therefore, a sequence transformation will only work well if the actual 
behavior of the remainders is in sufficient agreement with the assumptions made. Of course, 
this also implies that a sequence transformation, which makes certain assumptions, may 
fail to accomplish something if it is applied to a sequence with remainders of a sufficiently 
different behavior. Thus, the efhciency of a sequence transformation for certain sequences 
and its inefficiency or even nonregularity for other sequences are intimately related. 

Sequence transformations normally eliminate only approximations to the remainders. 
In such a case, the elements of the transformed sequence {s^}^q will also be of the type 



of (2.1), which means that s^ can also be partitioned into the (generalized) limit s and a 



transformed remainder r,Jj according to 

4 = s -K r; , n e No . (2.8) 

The transformed remainders {j'^j^o ^-re normally different from zero for all finite values of 
n. However, convergence is accelerated if the transformed remainders {r^jj^g vanish more 
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rapidly than the original remainders {f„}5^Q, 



J^r, '^ 1 . ' -n 



lim -^ = lim ^ = 0, (2.9) 

and a divergent sequence is summed if the transformed remainders rjj vanish as n — > oo. 

Assumptions about the n-dependence of the truncation errors can be incorporated into 
the transformation process via model sequences. In this approach, a sequence transformation 
Ej^' is constructed in such a way that it produces the (generalized) limit s of a model 
sequence 

Sn ^ S + fn ^ S + ^^ Cj (j);j{n) , (2-10) 

if it is applied to a set of fc + 1 consecutive elements of this model sequence: 

^(„) ^ E'^^'\snrSn+l.--- ,Sn+k) = S. (2.11) 

The 4>j{n) are assumed to be known functions of n, and the Cj are unspecified coefficients. 

The elements of this model sequence contain fc + 1 unknowns, the limit or antilimit s and 

the k coefficients cq, ci, . . . , Ck-i- Since all unknowns occur linearly, Cramer's rule implies 



that a sequence transformation, which is able to determine the limit or antilimit s in (2.1C) 
from the numerical values of fc + 1 sequence elements s„, s„+i, . . . , Sn+fc, is given as the ratio 
of two determinants |2^, p. 56]. Since determinantal representations are computationally 
unattractive, alternative computational schemes for E^' are highly desirable. A recursive 

scheme for E)^' was derived independently by Schneider [Q , Brezinski [|76|, and Havie [Q . 
A more economical implementation was later obtained by Ford and Sidi |7q ]. 

The sequence transformation E^ contains the unspecified quantities (j)j{n). The ma- 
jority of all currently known sequence transformations can be obtained by specializing the 
(pjin) (see for example [ p5| , pp. 57 - 58]). So, from a purely formal point of view either the 

(n) 

determinantal representation for E^. or the recursive schemes provide a complete solution 
to the majority of all convergence acceleration and summation problems. However, even the 
recursive scheme of Ford and Sidi ||78| is considerably more complicated than the recursive 



schemes for other transformations that can be obtained by specializing the 4>j{n) in (2.10) 



in) 

So, is it usually simpler to use instead of E^, one of its special cases. Important is also 
the following aspect: It is certainly helpful to know that for arbitrary functions 4>j{n) the 

in) 

sequence transformation E^, can be computed, but it is more important to find out which 
set {'/>j(ri)}°^o produces the best results for a given sequence {s„}^q. 



If the remainders of the model sequence (2.1C) are capable of producing sufficiently 



accurate approximations to the remainders of a sequence {s„}5^q, then the apphcation of the 

(n) 

sequence transformation Ej. to fc + 1 sequence elements s„, s„+i, . . . , s„+fc should produce 
a sufficiently accurate approximation to the (generalized) limit s of the input sequence. This 
is usually the case if the truncation errors can be expressed as infinite series in terms of the 
4'j{n). Further details as well as many examples can be found in [P5| , p8| . 

Simple asymptotic conditions are used in the literature to classify the type of convergence 
of a sequence. For example, many practically relevant sequences {sn}^o converging to some 
limit s can be characterized by the asymptotic condition 

lim f!^±l^i = p^ (2.12) 

n^oo Sn — S 

which closely resembles the ratio test in the theory of infinite series. If \p\ < 1, the sequence 
is called linearly convergent, and if p = 1, it is called logarithmically convergent. 
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Typical examples of linearly convergent sequences are the partial sums of a power series 
with a nonzero, but fini te r adius of convergence. In contrast, the partial sums X)iy=o(^^"-'^)~^ 
of the Dirichlet series (2^) for ({z) converge logarithmically. 



3 The Aitken Formula, Wynn's Epsilon Algorithm, and 
Related Transformations 

It will now be shown how Aitken's A" formula can be constructed by assuming that the 
truncation error consist of a single exponential term according to 



= ,s + cA'^ 



c/0, \X\^l, ne 



(3.1) 



For < |A| < 1, this sequence converges to its limit s, and for |A| > 1, it diverges away from 
its generalized limit or antilimit s. Thus, if this sequence converges, it converges linearly 
according to ( 2.12 ). 

By considering s, c, and A in (3J) as unknowns of the linear system Sn+j = s + cA""*"^ 
with j = 0, 1, 2, a sequence transformation can b e co nstructed which is able to determine 



the (generalized) limit s of the model sequence (3.1) from the numerical values of three 
consecutive sequence elements s„, s„+i and Sn-f-2- A short calculation shows that 



A\ 



(") 



[ASr. 



n e No: 



(3.2) 



produces the (generalized) limit s of the model sequence (3.1) according to A" = s. 
Alternative expressions for A" are discussed in pSl Section 5.1]. The forward difference 
operator A in (3.2) is defined according to AG{n) = G{n + 1 — G{n). 

The power and practical usefulness of Aitken's A^ formula is limited since it is designed 
to eliminate only a single exponential term. However, the output data Ai can be used as 
input data in the A^ formula (p^). Hence, the A^ process can be iterated, which leads to 
the following nonlinear recursive scheme |28, Eq. (5.1-15)]: 



a: 



(n) 



a'-'^^ ~ A 



(") 



[A^L' 



eNo, 

(")l2 



am!"' 



fc, n G No . 



(3.3a) 
(3.3b) 



In the case of doubly indexed quantities like A^^ , it will always be assumed that A only 
acts on the superscript n but not on the subscript fc according to AA:^ = v4|." 



A 



(n) 



There is an extensive literature on Aitken's A^ process and its iteration. Its numerical 
performance was studied in [g8|,|5^,^. It was also discussed in articles by Lubkin [B0|, 
^ Clark, Gray, and Adams @, Cordelher g], Hilhon |6|, 

p8[ , and Weniger [g8,53,S2], or in books by Baker and Graves- 
, Brezinski and Redivo Zaglia ]E§, Delahaye jEl, Walz pi. 



Shanks M, Tucker |83,|8| 
Jurkat [ ^7[ , Bell and Phillips 
Morris i|32|, Brezinski [^|2( 

and Wimp [ |l5[ . A multidimensional generalization of Aitken's transformation to vector 
sequences was discussed by MacLeod [90]. Modifications and generalizations of Aitken's A^ 
process were proposed by Drummond ]91], Jamieson and O'Beirne ]Q, Bj0rstad, Dahlquist, 
and Grosse [p3| , and Sablonniere |94]. The iteration of other sequence transformations is 
discussed in [p5| . 
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An obvious generalization of the model sequence (3J) would be the following model 
sequence which contains k exponential terms: 



fc-i 
i=0 



|Ao| > |Ai| > ... > |Afe_i 



(3.4) 



Although the A^ process (3^) is by construction exact for the model sequence (p^l), its 
iteratio n (|3.3| ) is not exact for the model sequence (3.4). Instead, this is - as first shown by 
Wynn }j6| and later extended by Sidi BT 



true for Wynn's epsilon algorithm JlC 



An) 



= 0, 



An) 



An) 






l/[' 



Xn+l) 



neNo, 

Anh 



k,n eNq. 



(3.5a) 
(3.5b) 



An) 



Only the elements Cjfc with even subscripts provide approximations to the limit s of the se- 
quence {sn}^o to be transformed. Efhcient recursive schemes based on the moving lozenge 
technique introduced by Wynn [Q are discussed in ^, Section 4.3]. 



A short calculation shows Ai = £2 ■ 



For k > 1, A^^ and e," are in general different, 



but they have similar properties in convergence acceleration and summation processes. 

If the input data s„ for Wynn's epsilon algorithm are the partial sums (1.3) of the 
(formal) power series for some function f{z) according to s„ = fn{z), then it produces Fade 
approximants according to |T^ 



An) 



+ fc/fc]/(z), fc,neNo. 



(3.6) 



Since the epsilon algorithm can be used for the computation of Fade approximants, it 
is discussed in books on Fade approximants such as 1 52| , but there is also an extensive 
literature dealing directly with it. In Wimp's book |15, p. 120], it is mentioned that over 
50 articles on the epsilon algorithm were published by Wynn alone, and at least 30 articles 
by Brezinski. As a fairly complete source of references Wimp recommends Brezinski's first 
book |jl9]. However, this book was published in 1977, and since then many more articles 
dealing with the epsilon algorithm have been published. Thus, a detailed bibliography 
would be beyond the scope of this article. Moreover, the epsilon algorithm is not restricted 
to scalar sequences but can be generalized to cover for example vector sequences. A very 
recent review can be found in ||99|]. 

Aitken's iterated A^ process (3^) as well as Wynn 's epsilon algorithm (3^) are powerful 



accelerators for sequences which according to ( 2.12 ) converge linearly, and they are also 
able to sum many alternating divergent series. However, they fail completely in the case 
of logarithmic convergence (compare for example IJO^, Theorem 12]). Moreover, in the case 
of divergent power series whose series coefficients grow more strongly than factorially. Fade 
approximants or equivalcntly Wynn's epsilon algorithm either converge too slowly to be 
numerically useful [ 100 , |l01| or are not at all able to accomplish a summation to a unique 
finite generalized limit [102|. 

Brezinski showed that the inability of Wynn's epsilon algorithm of accelerating logarith- 
mic convergence can be overcome by a suitable modification of the recursive scheme (3.5). 
This leads to the so-called theta algorithm pSl : 



An) 
'2k+l 

a{n) 
'2k+2 



0. 



An) 



An+l) 
'2k-l 

An+l) 
'2k 



2k J 



l/[M 

[A^r^)] 



n e No, 
, fc,neNo: 



^ ^2k+l 



k,n £ 



(3.7a) 
(3.7b) 

(3.7c) 
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As in the case of Wynn's epsilon algorithm ( |3.5[ ), only the elements 1!}^' with even subscripts 
provide approximations to the (generalized) limit of the sequence to be transformed. 

The theta algorithm was derived with the intention of overcoming the inability of the 
epsilon algorithm to accelerate logarithmic convergence. In that respect, the theta algorithm 



was a great success. Extensive numerical studies of Smith and Ford |79, 103 1 showed that 
the theta algorithm is not only very powerful, but also much more versatile than the epsilon 
algorithm. Like the epsilon algorithm, it is an efficient accelerator for linear convergence 
and it is also able to sum many divergent series. However, it is also able to accelerate the 
convergence of many logarithmically convergent sequences and series. Further details as 
well as additional references can be found in |23, Section 2.9] or in [GSl Sections 10 and 11]. 
As for example discussed in 1|95| , new sequence transformations can be constructed by 
iterating explicit expressions for sequence transformations with low transformation orders. 
This approach is also possible in the case of the theta algorithm. A suitable closed-form 
expression, which may be iterated, is 1|28| , Eq. (10.3-1)] 

An) _ ^ [A5„] [ASn+l] [A^g„+i] 

^2 - Sn+1 - TT TT-r^ i FT rf-TT T , U € ^Q ■ (,3.8) 

[As„+2j l.A^s„J - [As„J ]_A^s„+iJ 
Its iteration yields the following nonlinear recursive scheme [g8[ Eq. (10.3-6)]: 

Jt^ = s„, neNo, (3.9a) 

/(") - 7("+i) [AJfc^")] [AJ'i"+^'] [A^jt^''>] kneNn (3 9b) 

[Ajt-^^][A^jn-[Ajt^][A^jt-^^]' fc,.eNo.(3.9b) 

The iterated transformation J7^ has similar properties as the theta algorithm from which 
it was derived: Both are very powerful as well as very versatile. J^ is not only an effective 
accelerator for linear convergence as well as able to sum many divergent series, but it is also 
an effective accelerator for logarithmic convergence 28, p4, 95, 104-107|. 



Richardson Extrapolation, Wynn's Rho Algorithm, 
and Related Transformations 



As long as p in ( 2.12| ) is not too close to one, the acceleration of linear convergence is 



comparatively simple. With the help of Germain-Bonne's formal theory of convergence ac- 



celeration ]10S] and its extension |28, Section 12], it can be decided whether a sequence 
transformation is capable of accelerating linear convergence or not. Moreover, several se- 
quence transformations are known that accelerate linear convergence effectively. 

Logarithmic convergence leads to more challenging computational problems than linear 
convergence. An example is the Dirichlet series ( ^.6[ ) for the Riemann zeta function. As 
already discussed in Section g, its convergence of can become so slow that the evaluation of 
C{z) by successively adding up the terms {v + 1)^^ is practically impossible. 



There are also principal theoretical problems. Delahaye and Germain-Bonne ]109, 110 
showed that no sequence transformation can exist which is able to accelerate the convergence 
of all logarithmically convergent sequences. Consequently, an analogue of Germain-Bonne's 



beautiful formal theory of the acceleration of linear convergence ]108] and its extension pq. 
Section 12] cannot exist, and the success of a convergence acceleration process cannot be 
guaranteed unless additional information is available. 

Nevertheless, many sequence transformations are known which work at least for suit- 
ably restricted subsets of the class of logarithmically convergent sequences. Examples are 
Richardson extrapolation [Q, Wynn's rho algorithm Q and its iteration ]g^. Section 6], 



10 Ernst Joachim Weniger 



Osada's modification of the rho algorithm |111|, and the modification of the A process by 



Bj0rstad, Dahlquist, and Grosse |9^. However, there is considerable evidence that sequence 
transformations speed up logarithmic convergence less efficiently than linear convergence 
(see, for example, the discussion in |52| , Appendix A]). 

Another disadvantage of logarithmic convergence is that serious numerical instabilities 
are much more likely. A sequence transformation accelerates convergence by extracting 
and utilizing information on the index-dependence of the truncation errors from a finite 
set of input data. This is normally accomplished by forming higher weighted differences. 
If the input data are the partial sums of a strictly alternating series, the formation of 
higher weighted differences is a remarkably stable process, but if the input data all have the 
same sign, numerical instabilities are quite likely. Thus, if the sequence to be transformed 
converges logarithmically, numerical instabilities are to be expected, and it is usually not 
possible to obtain results that are close to machine accuracy. 

In some cases, these instability problems can be overcome with the help of a condensation 
transformation due to Van Wijngaarden, which converts input data having the same sign 
to the partial sums of an alternating series, whose convergence can be accelerated more 



effectively. The condensation transformation was first mentioned in [112, pp. 126 - 127] and 



only later published by Van Wijngaarden |113 . It was used by Daniel |114] in combination 
with the Eulcr transformation [pj, and recently, it was rederived by Pelzl and King |115[ . 
Since the transformation of a strictly alternating series by means of nonlinear sequence 
transformations is a remarkably stable process, it was in this way possible to evaluate 
special functions, that are defined by extremely slowly convergent monotone series, not only 
relatively efficiently but also clo se to machine accuracy (p2| , or to perform extensive quantum 



electrodynamical calculations |116]. Unfortunately, the use of this combined nonlinear- 
condensation transformation is not always possible: The conversion of a monotone to an 
alternating series requires that terms of the input series with large indices can be computed. 
For the construction of sequence transformations, which are able to accelerate loga- 
rithmic convergence, the standard interpolation and extrapolation methods of numerical 



mathematics |117] , |ll8[ are quite helpful. For that purpose, let us postulate the existence of 
a function 5 of a continuous variable which coincides on a set of discrete arguments {a:„}5^Q 
with the elements of the sequence {s„}^o ^o be transformed: 

S{xn) = s„, neNo- (4.1) 

This ansatz reduces the convergence acceleration problem to an extrapolation problem. If 
a finite string s„, Sn+i, ■ ■ ■ , Sn+k of fc + 1 sequence elements is known, one can construct an 
approximation Sk{x) to S{x) which satisfies the k + I interpolation conditions 

Sk{xn+j) = Sn+j , n e No , < j < /c . (4.2) 

In the next step, the value of Sk{x) has to be determined for x — > a;oo- If this can be done, 
we can expect that Sk{xao) will provide a better approximation to the limit s — Soc of the 
sequence {sn}^=o than the last sequence element Sn+k used for the construction of Sk{x). 
The most common interpolating functions are either polynomials or rational functions. 
In the case of polynomial interpolation, it is implicitly assumed that the fc-th order approx- 
imant Sk{x) is a polynomial of degree k in x: 

Sk{x) ^ cq + cix + ■■■ + Ckx'' . (4.3) 

For polynomials, the most natural extrapolation point is x = 0. Accordingly, the interpola- 
tion points Xn have to satisfy the conditions 

Xq > Xi > ■ ■ ■ > Xrn > Xra+1 > • • ■ > , (4.4a) 

lim x„ = 0. (4.4b) 



n — >cxD 
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The choice x^c = implies that the approximation to the hmit is to be identified with the 



constant term cq of the polynomial (4.3) 



Several different methods for the construction of interpolating polynomials Sk{x) are 
known. Since only the constant term cq of a polynomial Sk{x) has to be computed and since 
in most applications it is desirable to compute simultaneously a whole string of approximants 
5o(0), 5i(0), 52(0), ... , the most economical choice is Neville's scheme ]119| ] for the recursive 
computation of interpolating polynomials. If we set a; = 0, Neville's algorithm reduces to 
the following 2-dimensional linear recursive scheme pO| , p. 6]: 

Aft^ = s„, neNo, (4.5a) 

_^(„) ^ xj^ ^-x„+fc+rA/; fc,nGNo. (4.5b) 

Xn Xn+k+1 

In the literature, this variant of Neville's scheme is called Richardson extrapolation |^. 

There are functions that can be approximated more effectively by rational functions than 
by polynomials. Consequently, at least for some sequences {s„}^o rational extrapolation 
should give better results than polynomial extrapolation. Let us therefore assume that 
Sk{x) can be expressed as the ratio of two polynomials of degrees I and to, respectively: 

'^'-^"^^ bo + b,x + b,x^ + ... + K..- ^ fc,Z,TOeNo. (4.6) 

This rational function contains I + m + 2 coefficients oq, . . . , a; and b^, . . . , 6,„. However, 
only / + TO, + 1 coefficients are independent since they are determined only up to a common 
nonvanishing factor. Usually, one requires either &o = 1 or &„ = 1. Consequently, the 
k + 1 interpolation conditions ( |4.2| ) will determine the coefficients qq, . . . ,ai and bo, . . . ,bjn 
provided that k — I + m holds. 

The extrapolation point Xoo = is again an obvious choice. In this case, the interpolation 



points {xn}^=Q in (4.6) have to satisfy (4.4), and the approximation to the limit is to be 



identified with the ratio ao/60 of the constant terms in (4.6). 

If Z = TO holds in ( |4.6| ), extrapolation to infinity is also possible. In that case the 
interpolation points {a;n}J5°^o have to satisfy 

< xo < xi < • • • < a;„i < Xm+i < • ■ ■ , (4.7a) 

lim Xn = 00 , (4-7b) 

n — ^00 



and the approximation to the limit is to be identified with the ratio ai/bi in (|4j 

As in the case of polynomial interpolation, several different algorithms for the com- 
putation of rational interpolants are known. A discussion of the relative merits of these 
algorithms as well as a survey of the relevant literature can be found in [p3|. Chapter III] . 

The most frequently used rational extrapolation technique is probably Wynn's rho al- 
gorithm |17|: 

p'lI = 0, p[,") = s„, neNo, (4.8a) 

Pk+1 ^ /'fc-1 + -fTTTT) W' k,neno. (4.8b) 

Pk -Pk 

Formally, the only difference between Wynn's epsilon algorithm (|3.5|) and Wynn's rho al- 



gorithm is the sequence {xnj^o of interpolation points which have to satisfy (4.7). As 
in the case of the epsilon alj 
approximations to the limit. 



in) 

in the case of the epsilon algorithm, only the elements p^^ with even subscripts provide 
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In spite of their formal similarity, the epsilon and the rho algorithm have complementary 
features. The epsilon algorithm is a powerful accelerator for linear convergence and is also 
able to sum many divergent alternating series, whereas the rho algorithm fails to accelerate 
linear convergence and is not able to sum divergent series. However, it is a very powerful 
accelerator for many logarithmically convergent sequences. 

The properties of Wynn's rho algorithm are discu ssed in books by Brezinski [l^,|2y] and 
Wimp |l^. Then, there is an article by Osada |120| discussing its convergence properties, 
but otherwise, relatively little seems to b e kn own about its theoretical properties. 

As in the case of Aitken' s A ^ formula (3_^), iterated transformations can be constructed. 
For A; = 1, we obtain from (4 



(n) 
P2 



»n+l 



(a;„+2 - a;„)[As„+i][As„ 



[Ax„+i][As„] - [Aa:„][As„+i] 
This expression can be iterated yielding |p3, Eq. (6.3-3)] 



n e Nq. 



W, 



(n) 



neNo, 



(4.9) 



(4.10a) 



W, 



fc+i ~ 



W, 



(n+l) 



{Xn 



+2k+2 



-a;„)[AWr+'^][AWr] 



;(«)! 



{Xn+2k+2 - Xn+l)[AWi"'>] - {Xn+2k+l - X„) [AW^"+'^] ' 

k,neNo. (4.10b) 

(n) 

This is not the only possibility of iterating pj • However, the iterations derived by Bhowmick, 



Bhattacharya, and Roy |121| are significantly less efhcient than W^. , which has similar 



(") 



properties as Wynn's rho algorithm P8| , |95| . 

The main practical problem with sequence transformations based upon interpolation 
theory is that for a given sequence {sn}'^=o one has to find suitable interpolation points 
{xn}'^=Q that produces good results. For example, the Richardson extrapolation scheme 
(4.5) is normally used in combination with the interpolation points Xn — ^/{n + P) with 



Eq.V-3-20)]), 



/3 > 0. Then, Aff. possesses a closed form expression (see, for example, [g2[ Lemma 2.1, p. 
313] or ~ 



A/-^ = Ai"'(/3,.„) = (-l)^-^(-l)^ 



3=0 



{(3 + n + j)' 



■ Sn+j , 



fc, n e No ; 



and the recursive scheme (O) assumes the following form |g8|, Eq. (7.3-21)]: 



a[,"^(/3,s„) 



n e No 



(4.11) 



(4.12a) 



i"+^H/3,s„+i) + ^AAi")(/3,s„), fc,neNo. (4.12b) 



Similarly, Wynn's rho algorithm ( [4.8[ ) and its iteration (4.10) are normally used in com- 
bination with the interpolation points a;„ = 
example [Bsl Eq. (6.2-4)]) 



1, yielding the standard forms (see for 



P-i 

(«) 
Pk+i 



= 0, 



(n) 

Po 



Pk-1 



fc+1 



(n+1) 
Pk 



in) ' 
Pk 



k,n eNo, 



and ]^, Section 6.3] 



W, 



(n) 



neNo, 



Ht = wl"+'^ 



(2fc + 2)[A>V^"+^^][A>V^")] 
(2fc+l)A2w[."^ 



fc,n eNo 



(4.13a) 
(4.13b) 

(4.14a) 
(4.14b) 
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Many practically relevant logarithmically convergent sequences {s„}5^q can be represented 
by series expansions of the following kind: 



s + (n + /3)-"^c,/(n + /3)^ 



n e Nq. 



(4.15) 



Here, a is a positive decay parameter and /? is a positive shift parameter. In feq, Theorem 



14-4], it was shown that the standard form ( 4.12 ) of Richardson extrapolation accelerates 
the convergence of sequences of the type of ( [1.15| ) if a i s a positive integer, b ut f ails if a is 



nonintegral. This is also true for the standard form ( 4.13 ) of the rho algorithm |111, Theorem 
3.2] . In the case of the iteration of Wynn's rho algorithm, no rigorous theoretical result seems 
to be known but there is considerable empirical evidence that it only works if a is a positive 
integer JH, Section 14.4]). 



If the decay parameter a of a sequence o f th e type of ( 4.15 ) is known, then Osada's 
variant of Wynn's rho algorithm can be used |111, Eq. (3.1)]: 



P-i 



= 0, 



-(") 
Po 





(4.16a) 
(4.16b) 


_(«+!) _(n) ' /^,'^CiJu. 
Pk ~ Pk 



-(n) -(n+1) 

Pk+1 = Pk-1 



Osada a lso d emonstrated that his variant accelerates the convergence of sequences of the 
type of ( 4.15 ) for arbitrary a > 0, and that the transformation error satisfies the following 
L, Theorem 4]: 



asymptotic estimate 



HI 



-(") 

P2k 



= o( 



,-a-2k\ 



(4.17) 



Osada's variant of the rho algorithm can be iterated. From (4.16) we obtain the following 



expression for pj i^i terms of s„, s„+i, and Sn+2'- 

(a + 1) [As„][As„+i] 



-(») 

P2 



''n+1 



[A2 



ne No 



(4.18) 



If the iteration is done in such a way that a is increased by two with every recursive step, 
we obtain the following recursive scheme |^, Eq. (2.29)] which was originally derived by 
Bj0rstad, Dahlquist, and Grosse ||9|, Eq. (2.4)]: 



W, 



(») 



W 



(n) 
fc+l 



= w 



in+1) 



71 e No ; 

{2k- 



i("+i) 



(")l 



1) [Awri [Awri 



{2k + a) 



A2Wfc 



<") 



(4.19a) 
k,neNo. (4.19b) 



Bj0rstad, Dahlquist, a nd G rosse also showed that Wj. accelerates the convergence of se- 
quences of the type of ( 4. K ) , and that the transformation error satisfies the following asymp- 
totic estimate M, Eq. (3.1)] 



wi"^ 



oo 



,-Q-2/c\ 



(4.20) 



The explicit knowledge of the decay parameter a is crucial for an application of the 
transformations ( 4.16| ) and ( 4.1S| ) to a sequence of the type of (4.15). An approximation to 
a can be obtained with the help of the following nonlinear transformation, which was first 
derived in a somewhat disguised form by Drummond | |9l| and later rederived by Bj0rstad, 
Dahlquist, and Grosse p3[: 



T — 



[A2s„] [A^Sn+l] 



[As„+i] [A2 



'n+lj 



[As, 



n+2 



[A2 



1 , n G No . 



(4.21) 
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T„ is essentially a weighted A"^ method, which implies that it is potentially very unstable. 
Thus, stability problems are likely to occur if the relative accuracy of the input data is low. 
Bj0rstad, Dahlquist, and Grosse ||9^, Eq. (4.1)] also showed that 



r„ + 0(l/n2), n^oo, (4.22) 



if the elements of a sequence of the type of (4.15) are used as input data. 



5 Transformations with Explicit Remainder Estimates 

As discussed before, the action of a sequence transformation corresponds at least concep- 
tually to the construction an approximations to the truncation error r„, whose elimination 
from s„ leads to an acceleration of convergence or a summation. However, the remainders 
r„ may depend on n in a very complicated way, and the construction of approximations to 
the r„ and their subsequent elimination can be very difficult. In some cases, however, struc- 
tural information on the n-dependence of the r„ is available. For example, the truncation 
error of a convergent series with strictly alternating and monotonously decreasing terms is 
bounded in magnitude by the first term not included in the partial sum and has the same 
sign as this term m, p. 132]. The first term neglected is also the best simple estimate for the 
truncation error of a strictly alternating nonterminating hypergeometric series 2Fo{a, (3; —x) 
with a, /3, a; > ]122| , Theorem 5.12-5], which diverges for every nonzero argument x. Such 
an information should be extremely helpful. Unfortunately, the sequence transformations 
considered so far cannot benefit from it. 

Structural information of that kind can be incorporated into the transfo rmat ion process 
via explicit remainder estimates {^nj^O' ^^ ^^ ^^^ ^^^^ done by Levin |l23|] . For that 
purpose, let us assume that the remainders r„ of a sequence {s„}^q can be partitioned 
into a remainder estimate a;„ multiplied by a correction term Zn according to r„ = w„z„. 
The remainder estimates are chosen according to some rule and may depend on n in a 
very complicated way. If the remainder estimates {a;„}^o correctly describe the essential 
features of the remainders {'"nj^oi then the {^nj^g should depend on n in a relatively 
smooth way. Of course, we tacitly assume here that the products tOnZn are in principle 
capable of producing sufficiently accurate approximations to the remainders. 

Thus, we have to find a sequence transformation that is exact for the model sequence 

Sn = S + U)nZn, n S Nq , (5.1) 

where the remainder estimates {wnjj^o ^^^ assumed to be known. The principal advan- 
tage of this approach is that only approximations to the correction terms {Zn]'^=o have to 
be determined. If good remainder estimates can be found, the determination of Zn and 
the subsequent elimination of LOnZn from s„ often leads to clearly better results than the 
construction and elimination of other approximations to r„ . The explicit utilization of in- 
formation contained in remainder estimates is the major difference between the sequence 
transformations discussed in this Section and the other transformations of this article. 



The model sequence (5.1) has another indisputable advantage: There is a systematic 
way of constructing a sequence transformation which is exact for this model sequence. Let 
us assume that a linear operator T can be found which annihilates the correction term Zn 
according to T(z„) = 0. Then, a sequence transformation, which is exact for the model 
sequence ( |5.lD , can be obtained by applying T to [s„ — s]/a;„ — Zn- Since T annihilates z„ 
and is by assumption linear, the following sequence transformation T is exact for the model 
sequence (|I|) [||, Eq. (3.2-11)]: 

T(s„,t^„) = ^j— - — - = s. (5.2) 

T(l/w„) 



Nonlinear Sequence Transformations 



15 



Originally, the construction of sequence transformations via annihilation operators wa s in- 
troduced in |28, Section 3.2] in connection with a rederivation of Levin's transformation |l23t| 
and the construction of some other, closely related sequence transformations ||28|, Sec tions 7 - 
9]. Later, this operator ap proach w as also used and discuss ed b y Brezinski |q,|9|, |l24| , B rezin - 
ski and Redivo Zaglia ^5, 125, 126], Brezinski and Salam |127|, Brezinski and Matos |l28[ , 



Matos [|l2g, and Homeier [g9| 



If the annihilation operator T in (5^) is based upon the finite difference operator A, 
simple and yet very powerful sequence transformations are obtained |£8|, Sections 7-9]. 
As is well known, A*^ annihilates a polynomial Pk-i{n) of degree fc — 1 in n. Thus, the 
correction terms should be chosen in such a way that multiplication of Zn by some Wk (n) 
yields a polynomial Pk~i{n) of degree fc — 1 in n. 

Since A'^Wfc(n)z„ = /^^Pk-i{n) = 0, the weighted difference operator T = A'^Wki'n) 
annihilates Zn, and the corresponding sequence transformation (5.2) is given by the ratio 



7^^"^(wfc(n)|s„,w„) = 



A''{wk{n)sn/uJn} 
A''{wk{n)/ujn} 



(5.3) 



Several different sequence transformations are obtained by specializing Wk[n). For instance, 
Wk{n) = {n + Q^^'^ with C > yields Levin's sequence transformation |123|]: 



^k (CjSnj^n) — 
k 



Afc{(n + C)^-^gn/t^n} 



k\ {C,+n + 3) 



fc-i 



^n+j 



jj iC + n + k)'' 1 ujn+j 



E i-^y 



i=o 



k\ iC + n + j) 



fe-i 



(5.4) 



jJ iC + n + k)'' 1 ujn+j 



The shift parameter C has to be positive to allow n = in (5.4). The most obvious choice 
is C = 1- According to Smith and Ford |79, 103|, Levin's transformation is among the most 
powerful and most versatile sequence transformations that are currently known. 

The unspecified weights Wk{n) in (5.3) can also be chosen to be Pochhammer symbols 
according to Wkin) — [n + C)k-i with C > 0, yielding [En, Eq. (8.2-7)] 



'^r (c,s„,tj„) = 



k 

E 



(-1)^- 



A''{{n + C)k~l Sn/^n} 

Af^iin + Ok-i/uJn} 
k\ {C + n + j)k-i Sn+j 



jJ {C + n + k)k-i ujn+j 



j=o 



i-iy 



iC + n + j)k-i 



1 



(5.5) 



(C- 



k)k- 



^n+j 



As shown in several articles, this transformation is very effective if strongly divergent al- 
ternating series are to be summed |2^, po|, |49|-p^, p9H65|, pTJ-pE , 130 - 132 1. Again, the most 
obvious choice for the shift parameter is C = 1- 

Levin's transformation £)," (C,s„,w„) is by construction exact for the model sequence 
Sn = s + ujn "^j^o Cj/{n + C,)'K Consequently, Levin's transformation should work well if the 
ratio [s„ — s]/cl'„ can be expressed as a power series in l/(n + C). Similarly, SfP' (C, s„, a;„) 
is by construction exact for the model sequence s„ = s -1- ^n^j^fj Cj/(n -f ()j, which is 

a truncated factorial series |2|, Eq. (8.2-1)]. Accordingly, S\! [C,,Sn,^n) should give good 
results if the ratio [s„ — s\/u!n can be expressed as a factorial series. 
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Given a suitable sequence {(^nlj^o of remainder estimates, the sequence transformations 



6^^ and 5{."^ 



can be computed via their exphcit expressions (5.4) and (5.5). How ever, it i s 
usuaUy more effective to compute the numerator and denominator sums in (5.4) and (5.5) 
with the help of three-term recursions (28[ Eqs. (7.2-8) and (8.3-7)]. 

The explicit incor pora tion of the information contained in the remainder estimates makes 
the transformations (5^) and (5^) potentially very powerful. However, this is also their 
major potential weakness. If remainder estimates can be found such that the products 
^nZn provide good approximations to the remainders. Levin-type sequence transformations 
should work very well. If, however, good remainder estimates cannot be found, sequence 
transformations of th at k ind perform poorly. Consequently, a fortunate choice of the re- 
mainder estimates in (x4) and ( |5.5| ) is of utmost importance since it ultimately determines 
success or failure. 

The differen ce o perator A is linear. Consequently, the effect of the general sequence 
transformation (5.3) on an arbitrary sequence {sn}$^o with (generalized) limit s can be 
expressed as follows: 



rt''\wu{n) 



A''{wk{n){Sn - s)/uJn} 

A'^{wfc(n)/w„} 



(5.6) 



Obviously, ^ (wk (n) Sn, ^n) converges to s if the ratio on the right-hand side can be made 
arbitrarily small. This is the case if A'^Wkin) annihilates [s„ — s]/ujn more effectively than 
l/u!„. Thus, one should try to find remainder estimates such that the ratios [s„ — s]/w„ 
depend on n only weakly: 



[Sn-s]/uJn = C + 0{l/n), C^O, 



(5.7) 



This asymptotic condition does not determine the remainder estimates uniquely. Therefore, 
it is at least in principle possible to find for a given sequence an unlimited variety of different 
remainder estimates, which all satisfy this asymptotic condition. 



On the basis of heuristic and asymptotic arguments. Levin |123| suggested the following 
simple remainder estimates, which nevertheless often work remarkably well: 



LUr, 



= (C + ri)As„_i 
= As„_i , 

_ A5„_lA3„ 

As„_i - As„ 



C>o, 



(5.8) 
(5.9) 

(5.10) 



The use of these remainder estimates in ( p.4[ ) yields Levin's u, t, and v transformation, 
respectively ||, Eqs. (7.3-5), (7.3-7), and (7.3-11)]: 



,(«)/ 



4")(c,.„) = 4"^(c,s„,A.„_i), 

4"'(C, s„) = 4"^(C, Sn, (As„_iAs„)/(Asn-i " As„)) 
Later, Smith and Ford ]103[ suggested the remainder estimate 

LJn = As„ , 

which yields Levin's d transformation pq, Eq. (7.3-9)]: 



(5.11) 
(5.12) 
(5.13) 

(5.14) 



dr(C,.„) = £r(C,.„,A.„). 



(")/ 



(5.15) 
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Levin's t and d transformations are capable of accelerating linear convergence and they 
are particularly efficient in the case of alternating series, but fail to accelerate logarithmic 
convergence. Levin's u and v transformation are more versatile since they not only accelerate 
linear convergence but also many logarithmically convergence sequences and series. A more 
detailed discussion of the properties of these remainder estimates, some generalizations, 
additional heuristic motivation, and a description of the types of sequences, for which these 
estimates should be effective, can be found injES], Secti ons 7 and 12 - 14]. The main 
advantage of the simple remainder estimates (5^) - ( 5.10| ) and ( 5.14 ) is that they can be 



used in situations in which only the numerical values of a few elements of a slowly convergent 
or divergent sequence are known. 

The remainder estima tes (p!q) - ( ^.10 ) and ( 5.14 ) can also be used in the case of the 
sequence transformation (U), yielding ^, Eqs. (8.4-2), (8.4-3), (8.4-4), and (8.4-5)] 



yric, 



5r(C,s„,(C + n)As„_i) 



J") 



r{C,Sn) = 5r(C,Sn,A5„_i), 






= 5^"^ (C, s„, (As„_i As„)/(As„_i - As„)) , 
= 5^")(C,s„,As„). 



(5.16) 
(5.17) 
(5.18) 
(5.19) 



Alternative remainder estimates for the sequence transformations (5.4) and (|5.5| ) are dis- 
cussed in [ 28 , 58[ . Convergence properties of the sequence tran sformations ( 5A ) and ( |5.5| ) 
and their variants were analyzed in articles by Sidi 133 -135], in |28, Sections 12 - 14], 
in |6|, Section 4] , and also in ]g9[ . 

Other sequence transformations, which are also special cases of the general sequence 
transformation ( ^.2|) , can be found in ]g^. Sections 7 - 9] , in the book by Brezinski and Redivo 
Zaglia ]g^. Section 2.7], or in a recent review by Homeier [g^, which is the currently most 
complete source of information on Levin-type transformations. A sequence transformation, 
which interpolates between the transformations (5.4) and ( |5.5| ), was described in [ pl[ . 



6 Numerical Aspects 

As discussed before, sequence transformations try to accomplish an acceleration of conver- 
gence or a summation by detecting and utilizing regularities in the behavior of the elements 
of the sequence to be transformed. For sufficiently large indices n, one can expect that 
certain asymptotic regularities do exist. However, sequence transformations are normally 
used with the intention of avoiding the asymptotic domain, i.e., one tries to construct the 
transforms from the leading elements of the input sequence. Unfortunately, sequence ele- 
ments s„ with small indices n often behave irregularly. Consequently, it can happen that 
a straightforward application of a sequence transformation is ineffective and even leads to 
completely nonsensical results. In fact, one should not be too surprised that a strategy, 
which tries to avoid the asymptotic domain by extracting asymptotic information from the 
leading elements of an input sequence, occasionally runs into trouble. 

As a possible remedy, one should analyze the behavior of the input data as a function of 
the index and exclude highly irregular sequence elements - usually the leading ones - from 
the transformation process. This gives a much better chance of obtaining good and reliable 
transformation results. 

In view of the fact that irregular input data can never be excluded, one might expect that 
there are many references dealing with this topic. However, apart from a recent reference 
of my own ] p3| , I am only aware of an article by Gander, Golub, and Gruntz ]136] where 
it is shown that the convergence of extrapolations of iteration sequences can be improved 
by excluding the leading elements of the input sequence from the extrapolation process. 
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Otherwise, a discussion of the impact of irregular input data on convergence acceleration 
and summation processes seems to be part of the oral tradition only. 

A numerical process can only involve a finite number of arithmetic operations. Thus, 
a sequence transformation T can only use finite subsets of the type {s„,s„+i, . . . ,s„+;} 
for the computation of a new sequence element s^. Moreover, all the commonly used 
sequence transformations T can be represented by infinite sets of doubly indexed quantities 
T^" with fc,n e No that can be displayed in a two-dimensional array called the table of 
T. The superscript n always indicates the minimal index occurring in the finite subset of 

(n) 

input data used for the computation of T^ . The subscript k - usually called the order of 
the transformation - is a measure for the complexity of the transformation process which 
yields T^" . The elements T^" are gauged in such a way that Tq corresponds to an 
untransformed sequence element according to Tq = s„. An increasing value of k implies 
that the complexity of the transformation process increases. Moreover, I = l{k) in the 
substring {s„,s„+i,... , s„+i} also increases. This means that for every fc,n e No the 
sequence transformation T produces a new transform according to 

^fc" = "^{sruSn+l,--- ,Sn+l{k)) ■ (6-1) 

The relationship, which connects k and I, is specific for a given sequence transformation T. 
Let us assume that a sequence transformation T is to be used to speed up the convergence 
of a sequence {s„}^q to its limit s = Soo- One can try to obtain a better approximation to 
s by proceeding on an unlimited variety of different paths in the table of T. Two extreme 
types of paths - and also the ones which are predominantly used in practical applications 
- are order- constant paths JT"^" " } _„ with fixed transformation order k and increasing 

superscript, and index- constant paths {7j:"|_} with fixed minimal index n and increasing 
subscript. 

These two types of paths differ significantly. In the case of an order-constant path, a 
fixed number of ^ -|- 1 sequence elements {s„, s„+i, . . . s„+i} is used for the computation of 

(n) 

T^ , and the starting index n of this string of fixed length is increased successively until 
either convergence is achieved or the available elements of the input sequence are used up. 
In the case of an index- const ant path, the starting index n is kept fixed at a low value 
(usually n = Oorn = l) and the transformation order k is increased and with it the number 
of elements contained in the subset {s„, s„+i, . . . Sn^itk)}- Thus, on an index-constant path 
T^:" is always computed with the highest possible transformation order k from a given set 
of input data. 

In order to clarify the differences between order-constant and index-constant paths, 
let us consider the computation of Pade approximants with the help of Wynn's epsilon 



algorithm (3^). It follows from ( |3.6|) that the epsilon algorithm (3^) effects the following 
transformation of the partial sums (l^) of the power series for some function / to Pade 
approximants: 

{fniz),fn+l{z),...Jn+2k{z)} ^ [n + k / k] . (6.2) 

Thus, if we use a window consisting of 2fc + 1 partial sums fn+j{z) with < j < 2fc on an 
order-constant path and increase the minimal index n successively, we obtain the following 
sequence of Pade approximants: 

[n + k/k],[n-\-k-\-l/k],... ,[n -\- k + m/k], . . . . (6.3) 

Only 2fc -I- 1 partial sums are used for the computation of the Pade approximants, although 
many more may be known. Obviously, the available information is not completely utilized 
on such an order-constant path. 
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Then, the degree of the numerator polynomial oi [n + k + m/k] increases with increasing 
m (zNo, whereas the degree of the denominator polynomial remains fixed. Thus, these Pade 
approximants look unbalanced. Instead, it seems to be much more natural to use diagonal 
Pade approximants, i.e., Pade approximants with numerator and denominator polynomials 
of equal degree, or - if this is not possible - to use Pade approximants with degrees of the 
numerator and denominator polynomials that differ as little as possible. 

This approach has many theoretical as well as practical advantages. Wynn could show 
that if the partial sums /o(^), /i(^), ■ ■ • , f2n{z) of a Stieltjes series are used for the computa- 
tion of Pade approximants, then the diagonal approximant [n/n] provides the most accurate 
approximation to the corresponding Stieltjes function f{z), and if the partial sums fa{z), 
fi{z), • • • , /2n+i(^) are used for the computation of Pade approximants, then for z > 



either [n + 1/n] or [n/n + 1] provides the most accurate approximation [137|. A detailed 
discussion of Stieltjes series and their special role in the theory of Pade approximants can 
be found in p3. Section 5]. 

Thus, it is apparently an obvious idea to try to use either diagonal Pade approximants 
or their closest neighbors whenever possible. If the partial sums fo{z), fi{z), . . . , /^(z), 
. . . are computed successively and used as input data in the epsilon algorithm ( |3.5| ) , then 
we obtain the following staircase sequence in the Pade table [g8[ Eq. (4.3-7)], which exploits 
the available information optimally: 

[0/0], [1/0], [1/1], . . . [u/iy], [u + 1/u], [v + 1/j. + 1], . . . . (6.4) 

This example indicates that index- const ant paths are in principle computationally more 
efficient than order-constant paths since they exploit the available information optimally. 
This is also true for all other sequence transformations considered in this article. 

Another serious disadvantage of order-constant paths is that they cannot be used for the 
summation of divergent sequences and series since increasing n in the set {s„, s„+i, . . . , s„+/} 
of input data normally only increases divergence. Thus, it is apparently an obvious idea to 
use exclusively index-constant paths, and preferably those which start at a very low index 
n, for instance at n = or n = 1. This is certainly a good idea if all elements of the input 
sequence contain roughly the same amount of useful information. If, however, the leading 
terms of the sequence to be transformed behave irregularly, they cannot contribute useful 
information, or - to make things worse - they contribute wrong information. In such a 
case it is usually necessary to exclude the leading elements of the input sequence from the 
transformation process. Then, one should use either an order-constant path or an index- 
constant path with a sufficiently large starting index n. The use of an order-constant path 
has the additional advantage that the diminishing influence of irregular input data with 
small indices n should become obvious from the transformation results as n increases. 

In [|5^, g2|, |6^, ^ I did extensive quantum mechanical calculations for anharmonic os- 
cillators by summing strongly divergent perturbation expansions. The coefficients of these 
perturbation expansions are all rational numbers. Thus, they can be computed free of errors 
with the help of the exact rational arithmetics of a computer algebra system like Maple. 
In such a case, it is not only an obvious idea but in fact necessary to do the summation 
calculations on an index-constant path. The maximum transformation order in these sum- 
mation calculations was only limited by the number of perturbative coefficients that could 
be computed before the available memory of the computer was exhausted. 

The situation was completely different when I was involved in the extrapolation of quan- 
tum chemical oligomer calculations to the infinite chain limit |7l|-|7^. As discussed before, 
sequence transformations have to access the information stored in the later digits of the 
input data. However, the input data are produced by molecular ab initio programs which 
are huge packages of FORTRAN code (more than 100 000 lines of code). Normally, these 
programs operate in DOUBLE PRECISION which corresponds to an accuracy of 14 - 16 
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decimal digits (depending on the compiler). Unfortunately, this does not mean that their 
results have this accuracy. Some of the leading digits may have converged, and some of 
the trailing digits are corrupted because of internal approximations used in the molecular 
program or because of inevitable rounding errors. Thus, in most cases there is only a rel- 
atively narrow window of digits that can provide useful information for the transformation 
process. Moreover, the complexity of the calculations done in such a molecular program 
makes it impossible to obtain realistic estimates of the errors by the standard approaches 
of numerical mathematics. In such a case, it is recommendable to do the extrapolations on 
order-constant paths with low transformation orders, because high transformation orders 
can easily lead to meaningless extrapolation results. Further details can be found in ||73|. 

It may look like an obvious idea to use the results of a transformation process as input 
data for another sequence transformation. At least from a conceptual point of view, this is 
essentially the same as iterating a low order transformation like Aitken's A^ formula (|]3). 
If, however, the input data for the second transformation were obtained on an index-constant 
path by the first transformation, then I see basically two problems. The law, which governs 
the convergence of the first transformation, may be unknown and/or hard to find. Thus, it 
may be difficult to find a suitable second transformation. The second and - as I feel - more 
serious restriction is numerical in nature. The trailing digits of the input data for the second 
transformation may be largely corrupted by the first transformation. Thus, they cannot 
provide the information needed for a further improvement of convergence. There are ar ticle s 
which describe successful attempts of combining different sequence transformations ||7C, 138 1. 
However, I suspect that many more attempts were not reported in the literature because 
they failed to accomplish something substantial. 



7 Outlook 



If a review article is to written and if there is only a limited amount of space available, 
there is always the danger that the author emphasizes those aspects he knows particularly 
well, whereas other aspects are not treated as thoroughly as they probably should. This 
is of course also true for this review. So, I will now try to give a more balanced view by 
mentioning some additional applications of sequence transformations. 

In this review, exclusively the transformation of sequences of real or complex numbers 
was treated. However, sequence transformations can also be formulated for vector or matrix 
problems. Sequence transformations for vector sequences are for instance treated in the 
books by Brezinski and Redivo Zaglia pq. Sect ion 4] and Brezinski |124], or in a rticles by 
MacLeod @, Graves-Morris and Saff p9|p0|, Si di p| . Smith, Ford, and Sidi p^ , p3| , 
Sidi and Bridger JTiif, Jbilou and Sadok [145|,146[, Osada ll4M49f, Brezinski and Sadok 



[ iSOt , Matosj|5l|, Graves-Morris ||l5|,|l5||, Graves-Morris and Roberts ||l5|,^55|, Brezinski 
and Salam |127], Homeier, Rast, and Krienke ]156[ |, Salam |157-159|, Graves-Morris and Van 
Iseghem |16C], Roberts 161, 162|, and Graves-Morris, Roberts, and Salam [ p9[ . 

In the practice of the author, convergence acceleration and summation of infinite series 
has dominated. However, sequence transformations can also be very useful in the context 
of numerical quadrature, in particular if an oscillatory function is to be integrated over 
a semiinfinte interval. The use of sequence transformations in connection with numerical 



quadratures is fo r ins tance discussed in a book by Evans] 1 63 1, or in articles by Sidi [ 164 



Levin and Sidi |172|, Levin |l73[, Greif and Levin [174|, Safouhi and Hoggan [ 175 
Safouhi, Pinchon, and Hoggan ||l79t[ , and Safouhi [180,181[. 
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So far, sequence transformations were typically used to obtain better approximations to 
the limits of sequences, series, or numerical quadrature schemes. However, it is possible to 
use Fade approximants or other sequence transformations to make predictions for unknown 
perturbation series coefficients. For example, in some subfields of theoretical physics it is 
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extremely difficult to compute more than just a few perturbative coefficients, and even these 
few coefficients are usually affected by large relative errors. Thus, sequence transformations 
- mainly Pade approximants - were used to make predictions for unknown coefficients or 
to check the accuracy of already computed coefficients. Further details as well as many 
references can be found in |£^,^,^, 182|. In the majority of these predictions calculations. 



rational expressions in an unspecified symbolic variable were constructed, which were then 
expanded in a Taylor polynomial of suitable length with the help of a computer algebra 
program like Maple or Mathematica. However, in the case of Aitken's iterated A^ process 



(3.3), of Wynn's epsilon algorithm ( p.5[ ), and of the iteration ( p.9| ) of Brezinski's theta 
algorithm explicit recursive schemes for predictions were recently derived |p9| . With the help 
of these recursive schemes, it is possible to make predictions for power series coefficients with 
very large indices. In this way, it was possible to gain some insight into the mathematical 
properties of a perturbation expansion for a non-Hermitian PT-symmetric Hamiltonian 



|83| 



I hope that this review could show that sequence transformations are extremely useful 
numerical tools, and that there is a lot of active research going on, both on the mathematical 
properties of sequence transformations as well as on their application in applied mathematics 
and in the mathematical treatment of the sciences. Moreover, I am optimistic and expect 
further progress in the near future. 
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